
feat: unify ArksApplication on RBG v0.6.0 #79

Open
LikiosSedo wants to merge 5 commits into scitix:main from LikiosSedo:pr-unified-rbg060-clean

@LikiosSedo
Contributor

Summary

  • upgrade Arks to work with RBG v0.6.0
  • evolve ArksApplication into the unified inference entrypoint
  • support mode=unified|disaggregated in ArksApplication
  • support router in unified mode
  • keep ArksDisaggregatedApplication as the legacy-compatible path
  • align disaggregated ArksApplication top-level status summary

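The mode and router bullets above can be pictured with a small Go sketch. It is illustrative only and is not taken from the PR's code: the `Mode` type, `ParseMode`, and `NeedsRouter` are hypothetical names, and rejecting an empty mode (rather than defaulting it) is an assumption. The only behavior drawn from this PR is that unified mode may optionally use a router, while disaggregated mode implicitly requires one.

```go
package main

import "fmt"

// Mode is a hypothetical stand-in for the ArksApplication mode field.
type Mode string

const (
	ModeUnified       Mode = "unified"
	ModeDisaggregated Mode = "disaggregated"
)

// ParseMode accepts only the two modes named in this PR.
// Assumption: an empty or unknown value is rejected rather than defaulted.
func ParseMode(raw string) (Mode, error) {
	switch Mode(raw) {
	case ModeUnified, ModeDisaggregated:
		return Mode(raw), nil
	default:
		return "", fmt.Errorf("unsupported mode %q: want %q or %q",
			raw, ModeUnified, ModeDisaggregated)
	}
}

// NeedsRouter reports whether a router would be deployed.
// Disaggregated mode always needs one; unified mode only when requested.
func NeedsRouter(m Mode, routerRequested bool) bool {
	if m == ModeDisaggregated {
		return true
	}
	return routerRequested
}

func main() {
	for _, raw := range []string{"unified", "disaggregated", "hybrid"} {
		m, err := ParseMode(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("mode=%s needsRouter(without request)=%v\n", m, NeedsRouter(m, false))
	}
}
```

Keeping the router decision in one small function means both controllers' paths could share it, though how the actual controllers structure this is not shown in the PR description.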
Validation

Validated on test cluster:

  • old ArksDisaggregatedApplication resources survive upgrade from arks 0.2.2 + rbg050 alpha4 to arks_pr + rbg060
  • old PD (prefill/decode) resources remain runnable and updatable after upgrade
  • new ArksApplication(mode=disaggregated) can be created after upgrade
  • old and new resources can coexist and update independently

Notes

Not included in this PR:

  • local e2e-only changes
  • validation docs
  • temporary unified command override used only for local mock testing

Not fully runtime-validated on the current test cluster:

  • unified single-node
  • unified distributed
  • unified + router

Reason:

  • the clean PR branch intentionally removed the temporary unified command override
  • the current test cluster has no GPU and no confirmed CPU-capable internal runtime image

刘森栋 and others added 5 commits February 3, 2026 15:44
- Adapt to RBG v0.5.0 pointer type API changes
- Add CoordinationPolicy with Scaling and RollingUpdate strategies
- Scaling: coordinated initial deployment and scale-up
- RollingUpdate: coordinated rolling updates (Issue #150)
- CoordinationPolicy is independent of PodGroupPolicy
- Switch rbg dependency from internal GitLab to official GitHub v0.6.0
- Adapt RoleSpec.Template to RoleSpec.TemplateSource.Template for KEP-8
  RoleTemplate support in arksapplication and arksdisaggregatedapplication
  controllers

Disaggregated mode implicitly requires a router, and the router supports
only sglang. Without this check, a user could create a resource with
mode=disaggregated and runtime=vllm that passes validation, only to hit
a runtime error deep in reconciliation.